We assess the accuracy with which future galaxy surveys can measure cosmological parameters. By 
breaking parameter degeneracies of the Planck Cosmic Microwave Background satellite, the Sloan 
Digital Sky Survey may be able to reduce the Planck error bars by about an order of magnitude 
on the large-scale power normalization and the reionization optical depth, down to percent levels. 
However, pinpointing attainable accuracies to within better than a factor of a few depends crucially 
on whether it will be possible to extract useful information from the mildly nonlinear regime. 


I. INTRODUCTION 

One of the main challenges in modern cosmology is 
to refine and test the standard gravitational instability 
model of structure formation by precision measurements 
of its free parameters: the slope n and normalization Q 
of the primordial spectrum of density fluctuations, the 
densities of various types of matter, etc. A seminal paper 
in this journal [1] recently showed that future cosmic 
microwave background (CMB) experiments such as 
the MAP and Planck satellites would revolutionize this 
endeavor, allowing the simultaneous determination of a 
dozen parameters to hitherto unprecedented accuracies. 
This prompted several more detailed studies 1 I , which 
confirmed this optimistic conclusion. 

A parallel effort towards precision cosmology is the push for larger 
and more systematic galaxy redshift surveys. The 
largest currently available three-dimensional surveys contain 
about 25,000 galaxies. The 2dF survey (described in 
Q ) will measure ten times as many, and the Sloan Digital 
Sky Survey (SDSS) is scheduled to acquire a million redshifts 
within five years ]|||. It is therefore quite timely 
to perform an analogous first assessment of the ability to 
measure cosmological parameters with large galaxy surveys. 
This is the purpose of the present Letter. 


II. METHOD 

The accuracy with which cosmological parameters can 
be measured from a given data set is conveniently computed 
with the Fisher information matrix formalism (see 
jij for a comprehensive review). In our case, the data 
set can be viewed as an $N$-dimensional vector $\mathbf{x}$, whose 
components $x_i$ are the fluctuations in the galaxy density 
relative to the mean in $N$ disjoint cells that cover the 
three-dimensional survey volume in a fine grid. $\mathbf{x}$ is modeled 
as a random variable whose probability distribution 
$f(\mathbf{x}; \boldsymbol{\theta})$ depends on a vector of cosmological parameters 
$\boldsymbol{\theta}$ that we wish to estimate (for instance, we might have 
$\theta_1 = n$, $\theta_2 = Q$, etc.). The Fisher matrix is defined by 

$$F_{ij} \equiv -\left\langle \frac{\partial^2 \ln f}{\partial\theta_i\,\partial\theta_j} \right\rangle, \tag{1}$$

and its inverse $F^{-1}$ can, crudely speaking, be thought 
of as the best possible covariance matrix for the measurement 
errors on the parameters. The Cramér-Rao inequality 
H shows that no unbiased method whatsoever 
can measure the $i$th parameter with error bars (standard 
deviation) less than $1/\sqrt{F_{ii}}$. If the other parameters are 
not known but are estimated from the data as well, the 
minimum standard deviation rises to $(F^{-1})_{ii}^{1/2}$. 
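
The two error bars quoted above can be compared directly. The following is a minimal sketch; the 2x2 Fisher matrix is a made-up illustration (not a result of this Letter) and the function names are hypothetical.

```python
import numpy as np

def conditional_errors(F):
    """1/sqrt(F_ii): error on each parameter when all others are known."""
    return 1.0 / np.sqrt(np.diag(F))

def marginal_errors(F):
    """sqrt((F^-1)_ii): error on each parameter when all are fitted jointly."""
    return np.sqrt(np.diag(np.linalg.inv(F)))

# Hypothetical Fisher matrix for two strongly correlated parameters.
F = np.array([[4.0, 3.0],
              [3.0, 4.0]])

cond = conditional_errors(F)   # 1/sqrt(4) for both parameters
marg = marginal_errors(F)      # sqrt(4/7) for both parameters
print("conditional:", cond)
print("marginal:   ", marg)
```

The marginal errors are never smaller than the conditional ones, and the gap widens as the off-diagonal elements grow relative to the diagonal.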


A. The brute force approach 


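In the approximation that the probability distribution 
$f$ is a multivariate Gaussian with mean $\boldsymbol{\mu} \equiv \langle\mathbf{x}\rangle$ and 
covariance matrix $C \equiv \langle\mathbf{x}\mathbf{x}^T\rangle - \boldsymbol{\mu}\boldsymbol{\mu}^T$, eq. (1) becomes 

$$F_{ij} = \boldsymbol{\mu}_{,i}^T\, C^{-1} \boldsymbol{\mu}_{,j} + \frac{1}{2}\,\mathrm{Tr}\left[C^{-1} C_{,i}\, C^{-1} C_{,j}\right], \tag{2}$$

where a subscript $,i$ denotes $\partial/\partial\theta_i$. 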
This equation was employed in all the above-mentioned 
papers on CMB parameter determination, since for an 
all-sky CMB map, the covariance matrix $C$ can be diagonalized 
by a spherical harmonic expansion, making 
the computation of $F$ numerically trivial. For our galaxy 
survey case, the situation is more difficult. The analog 
of the CMB trick (a Fourier transformation) does not diagonalize 
$C$, since only a finite spatial volume is probed. 
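
For small problems the brute force route can be written in a few lines. The sketch below uses the standard Gaussian result, a mean term $\boldsymbol{\mu}_{,i}^T C^{-1} \boldsymbol{\mu}_{,j}$ plus a covariance term $\frac{1}{2}\mathrm{Tr}[C^{-1}C_{,i}C^{-1}C_{,j}]$, with dense matrices. The function name and the toy data model (independent samples with unknown mean and variance) are assumptions for illustration only.

```python
import numpy as np

def gaussian_fisher(C, dC, dmu):
    """Fisher matrix for Gaussian data with mean mu and covariance C.

    C   : (N, N) covariance matrix
    dC  : list of (N, N) arrays, dC[i] = dC/dtheta_i
    dmu : list of (N,) arrays,   dmu[i] = dmu/dtheta_i
    """
    Cinv = np.linalg.inv(C)
    p = len(dC)
    F = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            mean_term = dmu[i] @ Cinv @ dmu[j]
            cov_term = 0.5 * np.trace(Cinv @ dC[i] @ Cinv @ dC[j])
            F[i, j] = mean_term + cov_term
    return F

# Toy model: N independent samples, parameters = (mean, variance).
N, sigma2 = 5, 2.0
C = sigma2 * np.eye(N)
dC = [np.zeros((N, N)), np.eye(N)]   # dC/d(mean), dC/d(variance)
dmu = [np.ones(N), np.zeros(N)]      # dmu/d(mean), dmu/d(variance)
F = gaussian_fisher(C, dC, dmu)      # diag(N/sigma2, N/(2 sigma2^2))
print(F)
```

The explicit inverse and the matrix products scale as $N^3$, which is why this approach becomes costly for a fine grid of survey cells.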


B. A useful approximation 

Since brute force application of eq. (2) tends to obscure 
the underlying physics, we will now derive a simple approximation 
for $F$, which allows a more intuitive 
understanding of numerical results and shows the relative 
information contribution from different scales $k$. Ignoring 
redshift-space distortions and non-linear clustering, all 
the cosmological information is contained in the galaxy 
power spectrum $P(k)$. In the limit where the survey volume 
is much larger than the scale of any features in $P(k)$, 
it has been shown |nj that all the cosmological information 
in $\mathbf{x}$ is recovered when $P(k)$ is estimated with the 
FKP method [11]. Let us therefore redefine $x_n$ to be not 
the density fluctuation in the $n$th spatial volume element, 
but the average power measured with the FKP method 
in a thin shell of radius $k_n$ in Fourier space, with width 
$dk_n$ and volume element $V_n = 4\pi k_n^2\,dk_n/(2\pi)^3$. With our 
notation, we can rewrite the FKP results as 


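$$\mu_n \approx P(k_n), \tag{3}$$

$$C_{mn} \approx 2\,\delta_{mn}\,\frac{P(k_n)\,P(k_n)}{V_n V_{\rm eff}(k_n)}, \tag{4}$$

where 

$$V_{\rm eff}(k) \equiv \int \left[\frac{\bar n(\mathbf{r})\,P(k)}{1 + \bar n(\mathbf{r})\,P(k)}\right]^2 d^3r. \tag{5}$$
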
Here $\bar n(\mathbf{r})$ is the selection function of the survey, which 
gives the a priori expectation value for the number density 
of galaxies. $V_{\rm eff}(k)$ can be interpreted as the effective 
volume utilized for measuring the power at wavenumber 
$k$, since the integrand will be of order unity in those 
regions where the cosmic signal $P(k)$ exceeds the Poissonian 
shot noise $1/\bar n$, and typically gives only a small 
contribution from other regions. For a volume-limited 
survey, $\bar n$ is constant in the observed region, so $V_{\rm eff}$ (and 
hence the Fisher matrix) is simply proportional to the 
survey volume. 
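
As a rough numerical illustration of the effective volume, the sketch below integrates $[\bar n P/(1+\bar n P)]^2$ over spherical shells. It assumes a full-sky survey with a purely radial selection function; the function name, the number density and the power value are hypothetical choices, not SDSS numbers.

```python
import numpy as np

def effective_volume(nbar, P, r_max, n_steps=1000):
    """Integrate [nbar(r) P / (1 + nbar(r) P)]^2 over a sphere of radius r_max.

    nbar  : callable giving the mean galaxy density at radius r, in (Mpc/h)^-3
    P     : power at the wavenumber of interest, in (Mpc/h)^3
    r_max : survey depth in Mpc/h (full-sky, radial selection assumed)
    """
    edges = np.linspace(0.0, r_max, n_steps + 1)
    r_mid = 0.5 * (edges[1:] + edges[:-1])
    shell_vol = 4.0 * np.pi / 3.0 * (edges[1:]**3 - edges[:-1]**3)
    x = nbar(r_mid) * P
    return np.sum((x / (1.0 + x))**2 * shell_vol)

# Volume-limited toy case: constant density with nbar * P = 1.
r_max = 1000.0
nbar_const = 1.0e-4
P_k = 1.0e4
V_eff = effective_volume(lambda r: np.full_like(r, nbar_const), P_k, r_max)
V_survey = 4.0 * np.pi / 3.0 * r_max**3
print(V_eff / V_survey)   # one quarter of the survey volume when nbar * P = 1
```

When $\bar n P = 1$ the signal equals the shot noise and only a quarter of the geometric volume is effectively used; the fraction approaches unity as $\bar n P$ grows.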

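Choosing the shells thick enough to contain many uncorrelated 
modes each, $V_n V_{\rm eff}(k_n) \gg 1$, the central limit 
theorem indicates that $\mathbf{x}$ will be approximately Gaussian. 
In the same limit $V_n V_{\rm eff}(k_n) \gg 1$, the second term in 
eq. (2) will be completely dominated by the first [12], so 
substituting equations (3) and (4) into eq. (2) gives 

$$F_{ij} \approx \frac{1}{2} \sum_n \frac{\partial\ln P(k_n)}{\partial\theta_i}\,\frac{\partial\ln P(k_n)}{\partial\theta_j}\, V_n V_{\rm eff}(k_n). \tag{6}$$
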
(outside of which $\bar n = 0$) enter only via the weight function 
$w(k)$, which is essentially the number of independent 
modes of wavelength $\lambda$ that fit into the volume probed 
($V_{\rm eff}$). The top panel of Figure 1 shows the weight function 
for the main northern part of the SDSS p] and for 
the SDSS bright red galaxy (BRG) sample. The latter is 
assumed to be volume-limited at $1000\,h^{-1}$ Mpc, containing 
$10^5$ galaxies with a bias factor $b = 2$. 
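
The role of the weight function can be made concrete with a short numerical sketch. It evaluates $F_{ij} \approx 2\pi \int (\partial\ln P/\partial\theta_i)(\partial\ln P/\partial\theta_j)\, w(k)\, d\ln k$ with $w = V_{\rm eff}/\lambda^3$ by the trapezoidal rule. The function name, the grid and the constant effective volume are illustrative assumptions.

```python
import numpy as np

def fisher_from_power(k, dlnP, V_eff):
    """Approximate Fisher matrix from log-derivatives of the power spectrum.

    k     : (K,) wavenumbers in h/Mpc, increasing
    dlnP  : (p, K) array, dlnP[i] = dlnP/dtheta_i evaluated on the grid
    V_eff : (K,) effective volume at each k, in (Mpc/h)^3
    """
    lam = 2.0 * np.pi / k
    w = V_eff / lam**3                      # modes of wavelength lam in V_eff
    lnk = np.log(k)
    integrand = 2.0 * np.pi * dlnP[:, None, :] * dlnP[None, :, :] * w
    # Trapezoidal rule along the ln k axis.
    steps = np.diff(lnk)
    return np.sum(0.5 * (integrand[..., 1:] + integrand[..., :-1]) * steps, axis=-1)

# Toy case: normalization only (dlnP/dlnQ = 2) and a constant effective volume.
k = np.geomspace(1.0e-3, 0.1, 4000)
V = 1.0e9
dlnP = 2.0 * np.ones((1, k.size))
F = fisher_from_power(k, dlnP, np.full(k.size, V))
expected = V * (k[-1]**3 - k[0]**3) / (3.0 * np.pi**2)   # closed form for this case
print(F[0, 0], expected)
```

For constant $V_{\rm eff}$ the integrand grows as $k^3$, so almost all of the information comes from the largest wavenumbers included.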

We close this section by emphasizing that eq. (7) is 
a rather crude approximation, since it ignores edge effects, 
redshift space distortions and, most importantly, 
non-linear clustering. We will return to the last issue 
in Section III B. To quantify the edge effect errors, we 
have tested eq. (7) numerically by brute force manipulations 
of the $N \times N$ matrices of eq. (2) for a number 
of cases with $N \sim 10^4$, and find that it is typically accurate 
to within a factor of two for a cold dark matter 
(CDM) power spectrum when the survey size exceeds $200\,h^{-1}$ 
Mpc. The differences have two sources, with opposite 
sign, which both grow in importance if we decrease the 
survey volume: 


1. The effective number of modes probed is slightly 
larger than $V_{\rm eff}$ indicates, since the density field 
just inside the survey volume is correlated with that 
just outside. This reduces error bars. 

2. The measured power spectrum is effectively 
smoothed on the scale of the survey volume, which 
can destroy information on the small k behavior 
of the power spectrum and on sharp features and 
wiggles. This increases error bars. 


III. RESULTS AND CONCLUSIONS 
A. A linear clustering example 
Replacing the sum by an integral and using $d\ln P = 
dP/P$, this reduces to the handy approximation 

$$F_{ij} \approx 2\pi \int_{k_{\min}}^{k_{\max}} \frac{\partial\ln P(k)}{\partial\theta_i}\,\frac{\partial\ln P(k)}{\partial\theta_j}\, w(k)\, d\ln k, \tag{7}$$

where we have defined 

$$w(k) \equiv \frac{V_{\rm eff}(k)}{\lambda^3} \tag{8}$$

and the wavelength is $\lambda = 2\pi/k$. Eq. (7) conveniently 
separates the effects of cosmology from those of the 
survey-specific details. The former enter only through 
the logarithmic derivatives $\partial\ln P/\partial\theta_i$, which are plotted 
in Figure 1 for some simple examples. The selection function 
$\bar n$ and the geometric bounds of the survey volume 

Before discussing realistic non-linear power spectra, we 
will now highlight some of the features of eq. (7) with a 
simple linear power spectrum example. Let us consider 
a CDM power spectrum of the form 

$$P(k) = Q^2\, (\eta k/k_*)^n\, T(\eta k)^2. \tag{9}$$


On a log-log plot such as Figure 1 (top panel), varying the 
normalization $Q$ shifts the spectrum vertically, whereas 
varying the parameter $\eta$ shifts it horizontally. We chose 
$k_* = 0.025\,h\,\mathrm{Mpc}^{-1}$, roughly the scale where $P$ takes its 
maximum, so varying $n$ tilts the spectrum about its peak. 
The transfer function $T$ is computed numerically with the 
CMBFAST software [13] for a Hubble constant $h = 0.5$, 
baryon fraction $\Omega_b = 0.06$, CDM fraction $\Omega_c = 0.48$, and 
vacuum density (relative cosmological constant) $\Omega_v = 0.46$, 
chosen to be virtually indistinguishable from a Bond 
& Efstathiou model fit [14] with “shape parameter” 
$\Gamma = h\Omega_c = 0.25$. For our fiducial model, $n = \eta = 1$ and $Q$ is 
such that the $8\,h^{-1}$ Mpc normalization is $\sigma_8 = 1$. 

Partial derivatives needed for eq. (7) are plotted in the 
second panel of Figure 1: $\partial\ln P/\partial\ln Q = 2$, 
$\partial\ln P/\partial n = \ln(k/k_*)$, and 
$\partial\ln P/\partial\ln\eta = \partial\ln P/\partial\ln k$, simply the logarithmic 
slope of the power spectrum, ranging from $+1$ to 
$-3$ and vanishing at the peak (together with $\partial\ln P/\partial n$). 
The dependence on all other parameters $\theta_i$ enters via the 
transfer function. Figure 1 shows only one such example: 
the baryon fraction $\Omega_b$. 
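
These analytic derivatives are easy to check by finite differences. The sketch below uses the functional form $P(k) = Q^2 (\eta k/k_*)^n T(\eta k)^2$ with a made-up smooth transfer function standing in for the CMBFAST output; `toy_transfer`, its turnover scale and the helper names are assumptions for illustration only.

```python
import numpy as np

K_STAR = 0.025  # h/Mpc, pivot scale quoted in the text

def toy_transfer(q):
    # Hypothetical smooth transfer function, NOT the CMBFAST result.
    return 1.0 / (1.0 + (q / 0.05)**2)

def ln_power(k, lnQ=0.0, n=1.0, ln_eta=0.0):
    """ln P(k) for P = Q^2 (eta k / k_*)^n T(eta k)^2."""
    eta = np.exp(ln_eta)
    return (2.0 * lnQ + n * np.log(eta * k / K_STAR)
            + 2.0 * np.log(toy_transfer(eta * k)))

def central_diff(f, x0, h=1.0e-5):
    """Symmetric finite-difference derivative of f at x0."""
    return (f(x0 + h) - f(x0 - h)) / (2.0 * h)

k = np.geomspace(1.0e-3, 1.0, 50)
dlnP_dlnQ = central_diff(lambda x: ln_power(k, lnQ=x), 0.0)
dlnP_dn = central_diff(lambda x: ln_power(k, n=x), 1.0)
dlnP_dlneta = central_diff(lambda x: ln_power(k, ln_eta=x), 0.0)
dlnP_dlnk = central_diff(lambda x: ln_power(k * np.exp(x)), 0.0)

print(np.allclose(dlnP_dlnQ, 2.0, atol=1e-6))
print(np.allclose(dlnP_dn, np.log(k / K_STAR), atol=1e-6))
print(np.allclose(dlnP_dlneta, dlnP_dlnk, atol=1e-6))
```

With this toy transfer function the logarithmic slope runs from about $+1$ at small $k$ to about $-3$ at large $k$, mimicking the qualitative behavior described above.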

1. Single-parameter accuracy 


(the functions $\partial\ln P/\partial\theta_i$), where the inner product is defined 
by the weight function $w$. If any of the functions in 
the second panel can be written as a linear combination 
of some others, then $F$ will clearly be singular, and the 
errors on the corresponding parameters will be infinite. 
For instance, $\partial\ln P/\partial\ln\eta$ and $\partial\ln P/\partial n$ are essentially 
degenerate for $k_{\max} < 0.1\,h\,\mathrm{Mpc}^{-1}$ (they both look like 
straight lines vanishing at $k = k_*$, and the curvature of 
$\partial\ln P/\partial\ln\eta$ at $k < 0.01\,h\,\mathrm{Mpc}^{-1}$ is irrelevant since these 
scales receive so little weight), which is why $\ln\eta$ and $n$ 
have such large uncertainties in the bottom panel until 
$\partial\ln P/\partial\ln\eta$ bends downward and breaks this near degeneracy 
at $k \sim 0.1\,h\,\mathrm{Mpc}^{-1}$. 
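
The link between linear dependence and a singular Fisher matrix can be seen in a small numerical sketch. The weight $w \propto k^3$, the grid, and the two comparison derivatives (one exactly proportional to $\ln(k/k_*)$, one with added curvature) are hypothetical choices made for illustration.

```python
import numpy as np

def fisher_inner_products(derivs, weights):
    """F_ij = sum over k of derivs[i, k] * derivs[j, k] * weights[k]."""
    return (derivs * weights) @ derivs.T

k = np.geomspace(0.01, 0.1, 200)   # h/Mpc, cut off at k_max = 0.1
w = k**3                            # assumed weight, dominated by the k^3 factor
d_n = np.log(k / 0.025)             # dlnP/dn = ln(k/k_*) from the text

# Hypothetical derivative exactly proportional to d_n: a perfect degeneracy.
F_deg = fisher_inner_products(np.vstack([d_n, 2.0 * d_n]), w)
det_deg = F_deg[0, 0] * F_deg[1, 1] - F_deg[0, 1] * F_deg[1, 0]

# Hypothetical derivative with some curvature: the degeneracy is only partial.
d_curved = d_n - (k / 0.1)**2
F = fisher_inner_products(np.vstack([d_n, d_curved]), w)
conditional = 1.0 / np.sqrt(np.diag(F))
marginal = np.sqrt(np.diag(np.linalg.inv(F)))

print("determinant for exact degeneracy:", det_deg)
print("marginal / conditional errors:", marginal / conditional)
```

In the exactly degenerate case the determinant vanishes (up to rounding), so $F$ cannot be inverted; once curvature distinguishes the two derivative curves, the joint-fit errors become finite but remain larger than the single-parameter ones.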


The third panel in Figure 1 shows the error bars 
$\Delta\theta_i = 1/F_{ii}^{1/2}$ on each parameter that would result from 
the SDSS BRG survey if the true values of all other 
parameters were known, as a function of the upper 
limit of integration $k_{\max}$, with $k_{\min} = 0$. As eq. (7) 
shows, the information $F_{ii}$ on a parameter is simply the 
square of the corresponding curve in the second panel, 
integrated against the weight function in the top panel. 
For instance, there is no information about $\Omega_b$ on scales 
$k \ll k_*$, since the physical impact of baryons on fluctuation 
growth is different from that of CDM only on 
scales entering the horizon before matter and radiation 
decouple at $z \sim 10^3$ |li|. Also, we see that the bulk of 
the information on $\Omega_b$ is coming not from the characteristic 
baryon-induced acoustic oscillations (wiggles) in 
the transfer function, but from the overall suppression 
of power rightward of the peak. Although the wiggles 
help somewhat in breaking parameter degeneracy (discussed 
below), this can be somewhat misleading, since all 
but perhaps the first oscillation are likely to have been 
smeared out by mode coupling as the clustering goes nonlinear. 
For a more detailed treatment of the constraints 
on $\Omega_b$, submitted after the present paper, see |hJ. 

How should the limits of integration ($k_{\min}$ and $k_{\max}$) 
be chosen? Since information on scales comparable to 
and larger than the survey is destroyed by smearing and 
mean removal effects, it is natural to choose $1/k_{\min}$ to be 
of order the survey size. The choice of $k_{\max}$, on the other 
hand, is seen to be of paramount importance, since the $k^3$ 
phase space factor causes $w(k)$ to peak far shortward of 
the power spectrum peak scale $k_*$, where nonlinear effects 
become important. We defer this issue to Section III B. 


2. Degeneracies 

The bottom panel in Figure 1 shows the error bars 
$\Delta\theta_i = (F^{-1})_{ii}^{1/2}$ on each parameter that would result if 
a joint fit to all four parameters were performed, and no 
other constraints (e.g., from CMB maps) were available 
for the other three parameters. Eq. (7) can be interpreted 
as $F$ being the dot products of a set of vectors 


B. Non-linear clustering 

Since much of the information on cosmological parameters 
comes from small scales, non-linear clustering cannot 
be ignored when assessing the attainable accuracy. The 
power spectrum remains a perfectly well-defined quantity 
even in the deeply non-linear regime. However, the 
density field becomes non-Gaussian, which causes eq. (4) 
(and hence also eq. (7)) to misestimate the Fisher matrix 
in two competing ways: 

1. The variance of the power spectrum estimates tends 
to exceed the value given by eq. (4), causing us to 
underestimate the parameter error bars. 

2. Additional cosmological information is contained in 
the higher moments of the distribution, causing us 
to overestimate the parameter error bars. 

Bearing these important caveats in mind, we nonetheless 
apply eq. (7), using the analytic fits described in [17] 
to compute the relevant nonlinear power spectra. This 
changes the accuracy curves corresponding to Figure 1 
for $k \gtrsim 0.1\,h\,\mathrm{Mpc}^{-1}$, but only marginally. A more radical 
change occurs when including the linear bias factor $b$ (the 
ratio of the galaxy fluctuations to the underlying matter 
fluctuations, which we assume to be scale-independent) 
as an additional parameter, since in linear theory, it is 
perfectly degenerate with the large-scale power normalization 
$Q$. The top panel of Figure 2 shows the partial 
power derivatives with respect to $Q$, $b$, $n$ and $\eta$, and we 
see that nonlinear effects begin to break this degeneracy 
around the scale $k \sim 0.1\,h\,\mathrm{Mpc}^{-1}$. Power spectra with 
wiggles cannot be accurately treated with this nonlinear 
formalism, so we have used the above-mentioned wiggle-free 
Bond & Efstathiou transfer function fit [14] here to 
be conservative and avoid underestimating error bars. 


C. Combining galaxy surveys and CMB experiments 

So what is the bottom line? How well can future galaxy 
surveys constrain cosmological parameters? Since degeneracies 
are crucial, especially when considering joint fits 
to a dozen parameters as in the context of CMB experiments, 
a sensible answer must clearly take into account 
the degeneracy-breaking information from other sources. 
It has recently been shown [||Q| that CMB experiments 
suffer from a near-exact degeneracy between the spatial 
curvature and the cosmological constant $\Lambda$ (since they 
are virtually unable to distinguish between combinations 
that give the same angle-distance relationship), but this 
degeneracy is likely to be independently broken by both 
supernova and lensing measurements. The second worst 
degeneracy for the Planck satellite links the normalization 
$Q$ to $\tau$ (the optical depth from reionization), and 
partly also to the scalar-to-tensor ratio. This second 
degeneracy is an example where future galaxy surveys 
have the potential to substantially improve the situation. 
The bottom panel of Figure 2 shows the error bars 
on $b$ and $Q$ when $n$ and $\eta$ are assumed known, and it 
is seen that an accuracy $\Delta Q/Q = 1\%$ is attained for 
$\lambda = 2\pi/k \sim 18\,h^{-1}$ Mpc. A fundamental limit on the 
accuracy will probably arise from partial degeneracy with 
the location and slope of the spectrum ($\eta$ and $n$) on small 
scales, so since these parameters can only be measured to 
about 1% by Planck |jj, the $Q$-accuracy from SDSS will 
at best be of the same order. However, if this accuracy 
is indeed attainable despite the above-mentioned caveats 
regarding nonlinearity, it would be quite a radical improvement 
over the $\Delta Q/Q \sim 15\%$ that Planck alone can 
attain [||. By breaking this degeneracy, SDSS would also 
help Planck pin down the other parameters that were 
nearly degenerate with $Q$. For instance, repeating the 
analysis of |j| with a mere 1% prior uncertainty on $Q$, we 
find that the error bar on the reionization optical depth 
drops from 0.16 to 0.03, which would make reionization 
detectable at $1\sigma$ as late as $z = 8$ in a standard CDM 
cosmology. 
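
The effect of such a prior can be sketched with the usual rule that an independent Gaussian prior of width $\sigma$ on one parameter adds $1/\sigma^2$ to the corresponding diagonal element of the Fisher matrix. Whether the analysis above was carried out exactly this way is an assumption; the function names and the 2x2 matrix are hypothetical, and the numbers are not the Planck values quoted in the text.

```python
import numpy as np

def add_gaussian_prior(F, index, sigma):
    """Return a copy of F with an independent Gaussian prior on one parameter."""
    F_new = F.copy()
    F_new[index, index] += 1.0 / sigma**2
    return F_new

def marginal_errors(F):
    """sqrt((F^-1)_ii): error bars when all parameters are fitted jointly."""
    return np.sqrt(np.diag(np.linalg.inv(F)))

# Hypothetical Fisher matrix for (ln Q, tau) with a strong degeneracy.
# These numbers are illustrative and are not taken from the paper.
F_cmb = np.array([[100.0, 95.0],
                  [95.0, 100.0]])

before = marginal_errors(F_cmb)
after = marginal_errors(add_gaussian_prior(F_cmb, index=0, sigma=0.01))
print("errors without prior:", before)
print("errors with 1% prior on Q:", after)
```

Because the two parameters are strongly correlated, tightening one of them also shrinks the marginal error on the other, which is the mechanism by which a galaxy survey measurement of $Q$ would help constrain the optical depth.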

In conclusion, we have derived, tested and applied an 
approximate formula for the accuracy with which large 
galaxy surveys can measure cosmological parameters. Although 
our results indicate that such surveys can substantially 
enhance the accuracy attainable from CMB 
measurements alone, a number of issues must be addressed 
before quantitative claims should be believed. 

1. Are current calculations 0 of the non-linear 
power spectrum sufficiently accurate for our application 
(when including the effect of baryons, possible 
massive neutrinos, etc.)? 

2. Does the non-Gaussianity of the cosmological density 
field on weakly nonlinear scales cause our 
approximation to substantially over- or underestimate 
the attainable accuracy? 

3. Is biasing sufficiently non-linear |l^| on these scales 
to invalidate our results? 


Thus although eq. (7) is in itself a rather crude approximation, 
the main source of uncertainty lies elsewhere: in 
our ability to model and extract information from clustering 
in the marginally non-linear regime. The nonlinear 
domain appears to be a gold mine of cosmological 
information, but one whose riches may prove extremely 
difficult to extract. 
FIG. 1. The top panel shows the weight functions $w(k)$ for main and BRG samples of the SDSS, together with our fiducial 
linear and nonlinear CDM power spectra in $(h^{-1}\,\mathrm{Mpc})^3$ units. The second panel shows the logarithmic derivatives of the 
linear power spectrum with respect to its amplitude, horizontal location, slope and baryon content. The third panel shows the 
accuracy with which these parameters can be measured using information on wavenumbers up to $k = k_{\max}$ when the other 
parameters are already known, and the bottom panel shows the corresponding accuracies when all four parameters must be 
determined simultaneously. The vertical line indicates $k_*$, the location where $P(k)$ peaks. 
FIG. 2. Similar to Figure 1, but using nonlinear power spectra. In the bottom plot, all parameters except $Q$ and $b$ are 
assumed to be independently known. 